31 research outputs found

    Learning to compress and search visual data in large-scale systems

    The problem of high-dimensional and large-scale representation of visual data is addressed from an unsupervised learning perspective. The emphasis is put on discrete representations, where the description length can be measured in bits and hence the model capacity can be controlled. The algorithmic infrastructure is developed based on the synthesis and analysis prior models, whose rate-distortion properties, as well as capacity vs. sample complexity trade-offs, are carefully optimized. These models are then extended to multiple layers, namely the RRQ and the ML-STC frameworks, the latter of which is further developed into a powerful deep neural network architecture with fast and sample-efficient training and discrete representations. For the developed algorithms, three important applications are presented. First, the problem of large-scale similarity search in retrieval systems is addressed, where a double-stage solution is proposed, leading to faster query times and smaller database storage. Second, the problem of learned image compression is targeted, where the proposed models can capture more redundancies from the training images than conventional compression codecs. Finally, the proposed algorithms are used to solve ill-posed inverse problems. In particular, the problems of image denoising and compressive sensing are addressed with promising results. Comment: PhD thesis dissertation

    Privacy-Preserving Identification via Layered Sparse Code Design: Distributed Servers and Multiple Access Authorization

    We propose a new computationally efficient privacy-preserving identification framework based on layered sparse coding. The key idea of the proposed framework is sparsifying transform learning with ambiguization, which consists of a trained linear map, a component-wise nonlinearity and a privacy amplification. We introduce a practical identification framework, which consists of two phases: public and private identification. The public untrusted server provides a fast search service based on the sparse privacy-protected codebook stored at its side. The private trusted server or the local client application performs the refined, accurate similarity search using the results of the public search and the layered sparse codebooks stored at its side. The private search is performed in the decoded domain, and the accuracy of the private search is chosen based on the authorization level of the client. The efficiency of the proposed method lies in the computational complexity of encoding, decoding, "encryption" (ambiguization) and "decryption" (purification), as well as the storage complexity of the codebooks. Comment: EUSIPCO 201

    Privacy-Preserving Image Sharing via Sparsifying Layers on Convolutional Groups

    We propose a practical framework to address the problem of privacy-aware image sharing in large-scale setups. We argue that, while compactness is always desired at scale, this need is more severe when also trying to protect the privacy-sensitive content. We therefore encode images such that, on the one hand, representations are stored in the public domain without paying the huge cost of privacy protection, but are ambiguated and hence leak no discernible content from the images, unless a combinatorially expensive guessing mechanism is available to the attacker. On the other hand, authorized users are provided with very compact keys that can easily be kept secure. These keys can be used to disambiguate and faithfully reconstruct the corresponding access-granted images. We achieve this with a convolutional autoencoder of our design, where feature maps are passed independently through sparsifying transformations, providing multiple compact codes, each responsible for reconstructing different attributes of the image. The framework is tested on a large-scale database of images, with a public implementation available. Comment: Accepted as an oral presentation for ICASSP 202

    DeepTOFSino: A deep learning model for synthesizing full-dose time-of-flight bin sinograms from their corresponding low-dose sinograms

    Purpose: Reducing the injected activity and/or the scanning time is a desirable goal to minimize radiation exposure and maximize patients’ comfort. To achieve this goal, we developed a deep neural network (DNN) model for synthesizing full-dose (FD) time-of-flight (TOF) bin sinograms from their corresponding fast/low-dose (LD) TOF bin sinograms. Methods: Clinical brain PET/CT raw data of 140 normal and abnormal patients were employed to create LD and FD TOF bin sinograms. The LD TOF sinograms were created through 5% undersampling of FD list-mode PET data. The TOF sinograms were split into seven time bins (0, ±1, ±2, ±3). Residual network (ResNet) algorithms were trained separately to generate FD bins from LD bins. An extra ResNet model was trained to synthesize FD images from LD images to compare the performance of the DNN in sinogram space (SS) vs. implementation in image space (IS). Comprehensive quantitative and statistical analysis was performed to assess the performance of the proposed model using established quantitative metrics, including the peak signal-to-noise ratio (PSNR), structural similarity index metric (SSIM), region-wise standardized uptake value (SUV) bias and statistical analysis for 83 brain regions. Results: SSIM and PSNR values of 0.97 ± 0.01, 0.98 ± 0.01 and 33.70 ± 0.32, 39.36 ± 0.21 were obtained for IS and SS, respectively, compared to 0.86 ± 0.02 and 31.12 ± 0.22 for reference LD images. The absolute average SUV bias was 0.96 ± 0.95% and 1.40 ± 0.72% for SS and IS implementations, respectively. The joint histogram analysis revealed that the lowest mean square error (MSE) and highest correlation (R² = 0.99, MSE = 0.019) were achieved by SS compared to IS (R² = 0.97, MSE = 0.028). The Bland & Altman analysis showed that the lowest SUV bias (-0.4%) and minimum variance (95% CI: -2.6%, +1.9%) were achieved by SS images. The voxel-wise t-test analysis revealed the presence of voxels with statistically significantly lower values in LD, IS, and SS images compared to FD images, respectively. Conclusion: The results demonstrated that images reconstructed from the predicted TOF FD sinograms using the SS approach led to higher image quality and lower bias compared to images predicted from LD images.